ALPRD: assert patch dtype instead of casting#8611
Conversation
Merging this PR will improve performance by 25.3%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] |
27 µs | 15.8 µs | +70.64% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] |
215.3 ns | 186.1 ns | +15.67% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[1024] |
275.6 ns | 246.4 ns | +11.84% |
| ⚡ | Simulation | compact_sliced[(4096, 90)] |
838.1 ns | 750.6 ns | +11.66% |
Tip
Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.
Comparing claude/nice-hypatia-g68zgt-alp-patches (174c599) with claude/nice-hypatia-g68zgt-validate-ctx (7635ae6)
Footnotes
-
10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
7635ae6 to
c4687a7
Compare
ALPRD patch values are always stored as the non-nullable left-parts dtype, so the runtime cast in `canonicalize_patches` (which required an `ExecutionCtx` to materialise a lazy cast and an `all_valid` check) was unnecessary. - Replace `ALPRDData::canonicalize_patches(left_parts, patches, ctx)` with a ctx-free `validate_patches(left_parts, patches)` that asserts the patch values already have the non-nullable left-parts dtype (a non-nullable dtype also guarantees the values contain no nulls, so no `all_valid` execution is needed). Resolves the `TODO(ngates)/TODO(joe): assert the DType, don't cast`. - `validate_parts` likewise asserts the exact non-nullable patch dtype instead of `eq_ignore_nullability` + an `all_valid(LEGACY_SESSION)` execution. - `deserialize` no longer mints a `LEGACY_SESSION` ctx for patch handling. - `take`: `Patches::take` ignores null indices but reports a nullable dtype; drop the spurious nullability with a lazy `cast_values` (no execution at construction) so the patch values satisfy the non-nullable invariant. Net: removes 3 `LEGACY_SESSION` uses from ALPRD construction/validation. ALP already asserts its patch dtype (its patch values are the original floats), so no change was needed there. Signed-off-by: Robert <robert@spiraldb.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Mkrgj5SuJpaKBXK9jydtSc
Per review: patch values already share the left-parts dtype, so drop the `cast_values` in `take` entirely. Instead assert the invariant directly: patch dtype matches left parts (ignoring nullability) and the values are `definitely_no_nulls` (a static check, no execution). `Patches::take`/`take_search`/`take_map` previously gave the taken value array a nullable validity inherited from the take indices even when nulls were excluded (`include_nulls = false`). Since no null take-indices emit a value in that mode, the value array is now marked `Validity::NonNullable`, so taking a non-nullable patch values array yields non-nullable values and satisfies `definitely_no_nulls` without any cast or execution. Signed-off-by: Robert <robert@spiraldb.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Mkrgj5SuJpaKBXK9jydtSc
Apply the same treatment as ALPRD to ALP's compute paths: patch values are concrete floats with no nulls, so the `cast_values` in `take`/`mask` (which only adjusted nullability to match the taken/masked array) is unnecessary. - `validate_patches` now asserts the patch dtype matches the float type ignoring nullability, plus `definitely_no_nulls` (a static check), instead of requiring an exact nullability match. - `take`/`mask` drop `cast_values` and use the `Patches::take`/`mask` output directly (both preserve the values' no-null property). - Update `test_mask_alp_with_patches_*` to assert the surviving patch values remain null-free rather than being cast to nullable. Signed-off-by: Robert <robert@spiraldb.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Mkrgj5SuJpaKBXK9jydtSc
Now that `ALPRD::validate_patches` asserts the patch dtype and `definitely_no_nulls` statically (no execution), `ALPRD::try_new` no longer needs an `ExecutionCtx`. Remove the parameter and propagate the removal up through `RDEncoder::encode`/`encode_generic`, the compute kernels (`take`/`filter`/`mask`), btrblocks ALPRD compression, the compat-gen fixtures, and the benches. This also lets `MaskReduce::mask` (whose trait signature lacks an `ExecutionCtx`) stop minting a `LEGACY_SESSION` execution context at the trait boundary. Signed-off-by: Robert <robert@spiraldb.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Mkrgj5SuJpaKBXK9jydtSc
174c599 to
957ae64
Compare
Stacked on #8610 (which is stacked on #8609).
ALPRD patch values are always stored as the non-nullable left-parts dtype, so the runtime cast in
canonicalize_patches— which needed anExecutionCtxto materialise a lazy cast plus anall_validcheck — was unnecessary. This resolves the existingTODO(ngates)/TODO(joe): assert the DType, don't cast it.Changes
ALPRDData::canonicalize_patches(left_parts, patches, ctx)→ ctx-freevalidate_patches(left_parts, patches)that asserts the patch values already have the non-nullable left-parts dtype (a non-nullable dtype also guarantees no nulls, so theall_validexecution is dropped).validate_partsnow asserts the exact non-nullable patch dtype instead ofeq_ignore_nullability+all_valid(LEGACY_SESSION).deserializeno longer mints aLEGACY_SESSIONctx for patch handling.take:Patches::takeignores null indices but reports a nullable dtype, so a lazycast_valuesdrops the spurious nullability (no execution at construction; materialised later during decode).Net: removes 3
LEGACY_SESSIONuses from ALPRD construction/validation, with no execution during construction.ALP needed no change: its patch values are the original floats and its
validate_patchesalready asserts the dtype (no cast).ALP::try_new/ALPRD::try_newpublic signatures are unchanged, so there is no caller cascade (CUDA/bench call sites are untouched).Verification
cargo build/clippy/test -p vortex-alp✅ (173 tests, incl.take_with_nullsand take/decode conformance);clippy -D warningsclean;cargo +nightly fmt --checkclean.cargo build --all-targets -p vortex-compressor -p vortex-btrblocks✅ (ALP consumers).🤖 Generated with Claude Code
https://claude.ai/code/session_01Mkrgj5SuJpaKBXK9jydtSc
Generated by Claude Code